Abstract

The time evolution of the local field in symmetric Q-Ising neural networks
is studied for arbitrary Q. In particular, the structure of the noise and
the appearance of gaps in the probability distribution are discussed.
Results are presented for several values of Q and compared with numerical
simulations.
1 Introduction

In a number of papers in the nineties (cf. [1]-[10] and references therein) the
parallel dynamics of Q-Ising type neural networks has been discussed for several
architectures (extremely diluted, layered feedforward, recurrent) using a
probabilistic approach. For the asymmetric extremely diluted and layered
architectures the dynamics can be solved exactly and it is known that the local
field only contains Gaussian noise. For networks with symmetric connections,
however, things are quite different. Even for extremely diluted versions of these
systems feedback correlations become essential from the second time step onwards,
complicating the dynamics in a nontrivial way.

A complete solution for the parallel dynamics of symmetric Q-Ising networks
at zero temperature, taking into account all feedback correlations, has been
obtained only recently using a probabilistic signal-to-noise ratio analysis [9]-[10].
Thereby it is seen that both for the fully connected and the extremely diluted
symmetric architectures, the local field contains a discrete and a normally
distributed noise part. The difference between the two architectures is that for the
diluted model the discrete part at a certain time $t$ does not involve the spins at all
previous times $t-1, t-2, \ldots$ up to 0 but only the spins at time step $t-1$. Even so,
this discrete part prevents a closed-form solution of the dynamics, but a recursive
scheme can be developed in order to calculate the complete time evolution of the
order parameters, i.e., the retrieval overlap and the activity.

In the work above the focus has been on the non-equilibrium behavior of the
order parameters of the network. But, since the local field itself is a basic
ingredient in the development of the relevant recursive scheme, it is interesting to
study also the non-equilibrium behavior of the local field distribution. The more
so since this distribution does not converge to a simple sum of Gaussians, as is
frequently thought, but develops a gap structure. This is precisely one of the
points studied in detail in the present communication. Moreover, the analogies
and differences between the fully connected architecture and the symmetrically
diluted one are highlighted. Finally, numerical simulations are presented
confirming the analytic study and giving additional insight into the structure of
these local field distributions.

2 The model 

Consider a neural network $\Lambda$ consisting of $N$ neurons which can take values
$\sigma_i$ from a discrete set $\mathcal{S} = \{-1 = s_1 < s_2 < \ldots < s_Q = +1\}$. The $p$ patterns
to be stored in this network are supposed to be a collection of independent and
identically distributed random variables (i.i.d.r.v.), $\{\xi_i^\mu \in \mathcal{S}\}$, $\mu \in \mathcal{P} = \{1, \ldots, p\}$
and $i \in \Lambda$, with zero mean, $E[\xi_i^\mu] = 0$, and variance $A = \mathrm{Var}[\xi_i^\mu]$. The latter
is a measure for the activity of the patterns. Given the configuration $\sigma_\Lambda(t) \equiv
\{\sigma_j(t)\}$, $j \in \Lambda = \{1, \ldots, N\}$, the local field in neuron $i$ equals

$$h_i(\sigma_\Lambda(t)) = \sum_{j \in \Lambda} J_{ij}\, \sigma_j(t) \tag{1}$$

with $J_{ij}$ the synaptic coupling from neuron $j$ to neuron $i$. In the sequel we write
the shorthand notation $h_{\Lambda,i}(t) \equiv h_i(\sigma_\Lambda(t))$.

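For the extremely diluted symmetric (SED) and the fully connected (FC)
architectures the couplings are given by the Hebb rule

$$J_{ij}^{SED} = \frac{c_{ij}}{CA} \sum_{\mu \in \mathcal{P}} \xi_i^\mu \xi_j^\mu \quad \text{for } i \neq j, \qquad J_{ii}^{SED} = 0, \tag{2}$$

$$J_{ij}^{FC} = \frac{1}{NA} \sum_{\mu \in \mathcal{P}} \xi_i^\mu \xi_j^\mu \quad \text{for } i \neq j, \qquad J_{ii}^{FC} = 0, \tag{3}$$

with the $\{c_{ij} = 0, 1\}$, $i, j \in \Lambda$, chosen to be i.i.d.r.v. with distribution
$\Pr\{c_{ij} = x\} = (1 - C/N)\,\delta_{x,0} + (C/N)\,\delta_{x,1}$ and satisfying $c_{ij} = c_{ji}$.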
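As an illustration of the Hebb rules (2) and (3), the following minimal sketch builds both coupling matrices for a small network. The function names `hebb_couplings` and `local_field` and the argument `mean_connectivity` (playing the role of $C$) are hypothetical choices made for this example; only the Python standard library is used.

```python
import random

def hebb_couplings(patterns, activity, mean_connectivity=None, rng=None):
    """Build the symmetric coupling matrix J as a list of rows.

    mean_connectivity=None gives the fully connected rule (3);
    a number C gives the symmetrically diluted rule (2) with Pr{c_ij = 1} = C/N.
    """
    p, n = len(patterns), len(patterns[0])
    rng = rng or random.Random(0)
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):           # i < j only, so J_ii stays 0
            if mean_connectivity is None:
                norm = n * activity
            elif rng.random() < mean_connectivity / n:
                norm = mean_connectivity * activity
            else:
                continue                     # c_ij = c_ji = 0
            hebb = sum(patterns[mu][i] * patterns[mu][j] for mu in range(p))
            J[i][j] = J[j][i] = hebb / norm  # one draw per pair enforces c_ij = c_ji
    return J

def local_field(J, state, i):
    """Local field (1) in neuron i for the given configuration."""
    return sum(J_ij * s_j for J_ij, s_j in zip(J[i], state))
```

Drawing each $c_{ij}$ once per unordered pair is what makes the diluted matrix symmetric; drawing $c_{ij}$ and $c_{ji}$ independently would instead give the asymmetric diluted model.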

For the diluted symmetric model the architecture is a local Cayley-tree but,
in contrast with the diluted asymmetric model, it is no longer directed, such that
it causes a feedback from $t \geq 2$ onwards. In the limit $N \to \infty$ the number of
connections $|T_i|$, with $T_i = \{j \in \Lambda \,|\, c_{ij} = 1\}$ the set of sites giving information
to the site $i \in \Lambda$, still follows a Poisson distribution with mean $C = E[|T_i|]$.
Thereby it is assumed that $C \ll \log N$ and, in order to get an infinite average
connectivity allowing to store infinitely many patterns, one also takes the limit
$C \to \infty$ [10].



At zero temperature all neurons are updated in parallel according to the rule

$$\sigma_i(t+1) = g_b(h_{\Lambda,i}(t)), \qquad g_b(x) \equiv \sum_{k=1}^{Q} s_k \Big[\theta\big[b(s_{k+1} + s_k) - x\big] - \theta\big[b(s_k + s_{k-1}) - x\big]\Big] \tag{4}$$

with $s_0 \equiv -\infty$ and $s_{Q+1} \equiv +\infty$. Here $g_b(\cdot)$ is the gain function and $b > 0$ is the
gain parameter of the system. For finite Q, this gain function is a step function.
The gain parameter $b$ controls the average slope of $g_b(\cdot)$.
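The step structure of the gain function (4) can be made concrete with a short sketch. The helper names `equidistant_states` and `gain` are hypothetical, the states are assumed to be equally spaced in $[-1, +1]$, and the value exactly at a threshold (where $\theta(0)$ would have to be fixed) is assigned to the lower state by convention.

```python
import math

def equidistant_states(q):
    """States s_1 < ... < s_Q equally spaced in [-1, +1] (assumption; needs q >= 2)."""
    return [-1.0 + 2.0 * k / (q - 1) for k in range(q)]

def gain(x, b, states):
    """Return s_k when b(s_k + s_{k-1}) < x <= b(s_{k+1} + s_k), for b > 0."""
    padded = [-math.inf] + list(states) + [math.inf]   # s_0 and s_{Q+1}
    for k in range(1, len(states) + 1):
        lower = b * (padded[k] + padded[k - 1])
        upper = b * (padded[k + 1] + padded[k])
        if lower < x <= upper:
            return padded[k]
    raise ValueError("x must not be NaN")
```

For $Q = 2$ the single threshold sits at $x = 0$ for every $b$, so the rule reduces to the sign function of the Hopfield model; for $Q = 3$ the thresholds are at $\pm b$.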
3 Local field dynamics

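In order to measure the retrieval quality of the system one can use the
Hamming distance between a stored pattern and the microscopic state of the network

$$d(\xi^\mu, \sigma_\Lambda(t)) \equiv \frac{1}{N} \sum_{i \in \Lambda} \big[\xi_i^\mu - \sigma_i(t)\big]^2 . \tag{5}$$

This introduces the main overlap and the arithmetic mean of the neuron activities

$$m_\Lambda^\mu(t) = \frac{1}{NA} \sum_{i \in \Lambda} \xi_i^\mu \sigma_i(t), \quad \mu \in \mathcal{P}; \qquad a_\Lambda(t) = \frac{1}{N} \sum_{i \in \Lambda} [\sigma_i(t)]^2 . \tag{6}$$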
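Expanding the square in (5) shows why the overlap and the activity appear: $d = \frac{1}{N}\sum_i (\xi_i^\mu)^2 - 2A\, m_\Lambda^\mu(t) + a_\Lambda(t)$, where the first term tends to $A$ for large $N$. The sketch below checks this identity on a toy configuration; the function names are hypothetical and the pattern and state values are made up for illustration.

```python
def hamming_distance(pattern, state):
    """Eq. (5): mean squared difference between a pattern and the network state."""
    return sum((x - s) ** 2 for x, s in zip(pattern, state)) / len(state)

def overlap(pattern, state, activity):
    """Eq. (6), first part: overlap normalised by N times the pattern activity A."""
    return sum(x * s for x, s in zip(pattern, state)) / (len(state) * activity)

def neural_activity(state):
    """Eq. (6), second part: arithmetic mean of the squared neuron values."""
    return sum(s * s for s in state) / len(state)

# Toy Q = 3 example; A = 2/3 is the variance of a uniform pattern on {-1, 0, +1}.
pattern = [1, -1, 0, 1]
state = [1, 0, 0, -1]
A = 2.0 / 3.0
d = hamming_distance(pattern, state)
m = overlap(pattern, state, A)
a = neural_activity(state)
pattern_norm = sum(x * x for x in pattern) / len(pattern)
```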

The key question is then how these quantities evolve in time under the parallel
dynamics specified before. For a general time step we find from eq. (4), using the
law of large numbers (LLN), that in the thermodynamic limit

$$m^1(t+1) \overset{\Pr}{=} \frac{1}{A} \langle\!\langle \xi_i^1\, g_b(h_i(t)) \rangle\!\rangle, \qquad a(t+1) \overset{\Pr}{=} \langle\!\langle g_b^2(h_i(t)) \rangle\!\rangle, \tag{7}$$

where the convergence is in probability [11]. In the above $\langle\!\langle \cdot \rangle\!\rangle$ denotes the
average both over the distribution of the embedded patterns $\{\xi_i^\mu\}$ and the initial
configurations $\{\sigma_i(0)\}$. The average over the latter is hidden in an average over
the local field through the updating rule (4).

Some remarks are in order. For the symmetric diluted model the sum over the
sites $i$ is restricted to $T_j$, the part of the tree connected to neuron $j$. Moreover,
for that model the thermodynamic limit contains the limit $C \to \infty$ besides the
$N \to \infty$ limit. In this thermodynamic limit $C, N \to \infty$ all averages have to
be taken over the treelike structure, viz. $\frac{1}{N}\sum_{i \in \Lambda} \to \frac{1}{C}\sum_{i \in T_j}$, and the capacity
defined by $\alpha = p/N$ has to be replaced by $\alpha = p/C$.

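In (7) the local field is the main ingredient. Suppose that the initial
configuration of the network $\{\sigma_i(0)\}$, $i \in \Lambda$, is a collection of i.i.d.r.v. with mean
$E[\sigma_i(0)] = 0$, variance $\mathrm{Var}[\sigma_i(0)] = a_0$, and correlated with only one stored
pattern, say the first one $\{\xi_i^1\}$:

$$E[\xi_i^\mu \sigma_j(0)] = \delta_{i,j}\, \delta_{\mu,1}\, m_0^1 A \tag{8}$$

with $m_0^1 > 0$. By the LLN one gets for the main overlap and the activity at $t = 0$

$$m^1(0) \equiv \lim_{(C),N \to \infty} m_\Lambda^1(0) \overset{\Pr}{=} \frac{1}{A} E[\xi_i^1 \sigma_i(0)] = m_0^1 \tag{9}$$

$$a(0) \equiv \lim_{(C),N \to \infty} a_\Lambda(0) \overset{\Pr}{=} E[\sigma_i^2(0)] = a_0 \tag{10}$$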
where the notation should be clear. In order to obtain the configuration at $t = 1$
we have to calculate the local field (1) at $t = 0$. To do this we employ the
probabilistic signal-to-noise ratio analysis ([9]-[10]). Recalling the learning rules
(2)-(3) we separate the part containing the signal from the part containing the noise.
In the limit $N \to \infty$ we then arrive at

$$h_i(0) \equiv \lim_{N \to \infty} h_{\Lambda,i}(0) \overset{\mathcal{D}}{=} \xi_i^1 m^1(0) + \mathcal{N}(0, \alpha a(0)) \tag{11}$$

where the convergence is in distribution [11] and with $\mathcal{N}(0, V)$ representing a
Gaussian random variable with mean 0 and variance $V$. We note that this
structure of the distribution of the local field at time zero (signal plus Gaussian
noise) is typical for all architectures treated in the literature.
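The signal-plus-Gaussian-noise structure of eq. (11) can be checked empirically. The following is a minimal sketch, assuming a fully connected $Q = 2$ network (so $A = 1$ and $a_0 = 1$) and hypothetical function names; it samples the noise $h_i(0) - \xi_i^1 m_\Lambda^1(0)$ over all sites so that its variance can be compared with $\alpha a(0) = p/N$.

```python
import random

def initial_field_noise(n, p, m0, seed=0):
    """Noise part h_i(0) - xi_i^1 m^1(0) for a fully connected Q = 2 network."""
    rng = random.Random(seed)
    patterns = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(p)]
    # sigma_i(0) equals xi_i^1 with probability (1 + m0)/2, so E[xi^1 sigma] = m0.
    state = [x if rng.random() < (1 + m0) / 2 else -x for x in patterns[0]]
    # sums[mu] = sum_j xi_j^mu sigma_j(0); A = 1 for binary patterns.
    sums = [sum(x * s for x, s in zip(pat, state)) for pat in patterns]
    m1 = sums[0] / n
    noise = []
    for i in range(n):
        # Local field (1) with the Hebb couplings (3); the j = i term is excluded.
        h = sum(patterns[mu][i] * (sums[mu] - patterns[mu][i] * state[i])
                for mu in range(p)) / n
        noise.append(h - patterns[0][i] * m1)
    return noise

def mean_and_variance(xs):
    """Sample mean and (biased) sample variance of a list of numbers."""
    mean = sum(xs) / len(xs)
    return mean, sum((x - mean) ** 2 for x in xs) / len(xs)
```

For finite $N$ and $p$ the sample variance fluctuates around $p/N$ by several percent, so agreement should only be expected within such a margin.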

For a general time step $t + 1$, a tedious study reveals that the distribution of
the local field is given by [9], [10]

$$h_i(t+1) = \xi_i^1 m^1(t+1) + \mathcal{N}(0, \alpha a(t+1)) + \chi(t)\big[F(h_i(t) - \xi_i^1 m^1(t)) + \alpha \sigma_i(t)\big] \tag{12}$$

where $F = 1$ for the fully connected architecture and $F = 0$ for the symmetrically
diluted one. So, the local field at time $t$ consists of a discrete part and a
normally distributed part, viz.

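$$h_i(t) = M_i(t) + \mathcal{N}(0, V(t)) \tag{13}$$

where $M_i(t)$ and $V(t)$ satisfy the recursion relations

$$M_i(t+1) = \chi(t)\big[F(M_i(t) - \xi_i^1 m^1(t)) + \alpha \sigma_i(t)\big] + \xi_i^1 m^1(t+1) \tag{14}$$

$$V(t+1) = \alpha a(t+1) + F \chi^2(t) V(t) + 2 F \alpha A \chi(t)\, \mathrm{Cov}[\tilde{r}^\mu(t), r^\mu(t)] . \tag{15}$$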

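The quantity $\chi(t)$ reads

$$\chi(t) = \sum_{k=1}^{Q-1} f_{h_i^\mu(t)}\big(b(s_{k+1} + s_k)\big)\,(s_{k+1} - s_k) \tag{16}$$

where $f_{h_i^\mu(t)}$ is the probability density of $h_i^\mu(t)$ in the thermodynamic limit.
Furthermore, $r^\mu(t)$ is defined as

$$r^\mu(t) = \lim_{N \to \infty} \frac{1}{A\sqrt{N}} \sum_{i \in \Lambda} \xi_i^\mu \sigma_i(t), \quad \mu \in \mathcal{P} \setminus \{1\}, \tag{17}$$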
and $\tilde{r}^\mu(t)$ is given by a similar expression with $\sigma_i(t)$ replaced by
$g_b\big(h_{\Lambda,i}(t) - \frac{1}{\sqrt{N}}\, \xi_i^\mu r_\Lambda^\mu(t)\big)$. Finally, as can be read off from eq. (14), the quantity
$M_i(t)$ consists of a signal term and a discrete noise term, viz.

$$M_i(t) = \xi_i^1 m^1(t) + \alpha \chi(t-1) \sigma_i(t-1) + F \sum_{t'=0}^{t-2} \alpha \left[\prod_{s=t'}^{t-1} \chi(s)\right] \sigma_i(t') . \tag{18}$$

Since different architectures contain different correlations, not all terms in these
final equations are present, as is apparent through $F$. We remark that for the
asymmetric diluted and the layered feedforward architecture $M_i(t) = \xi_i^1 m^1(t)$, so
that in these cases the local field consists of a signal term plus Gaussian noise
for all time steps [6], [7].
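That the closed form (18) indeed solves the recursion (14), starting from $M_i(0) = \xi_i^1 m^1(0)$, can be verified numerically for arbitrary input sequences. The sketch below does this for a single site; the function names and the convention that `signal[t]` stores $\xi_i^1 m^1(t)$ are assumptions of this example, and the numbers fed in are arbitrary test values rather than solutions of the dynamics.

```python
def discrete_part_recursive(chi, sigma, signal, alpha, F):
    """Iterate recursion (14) from M(0) = signal[0]; returns M(t) for t = len(chi)."""
    M = signal[0]
    for t in range(len(chi)):
        M = chi[t] * (F * (M - signal[t]) + alpha * sigma[t]) + signal[t + 1]
    return M

def discrete_part_closed(chi, sigma, signal, alpha, F):
    """Evaluate the closed form (18) at t = len(chi)."""
    t = len(chi)
    if t == 0:
        return signal[0]
    total = signal[t] + alpha * chi[t - 1] * sigma[t - 1]
    for tp in range(t - 1):            # t' = 0, ..., t-2
        prod = 1.0
        for s in range(tp, t):         # s = t', ..., t-1
            prod *= chi[s]
        total += F * alpha * prod * sigma[tp]
    return total
```

With $F = 0$ only the spin at time $t-1$ survives, which is the statement that the diluted model keeps no memory of earlier spins in its discrete noise.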

For the architectures treated here we still have to determine the probability
density $f_{h_i(t)}$ in eq. (16). This can be done by looking at the form of $M_i(t)$ given by
eq. (18). The evolution equation tells us that $\sigma_i(t')$ can be replaced by $g_b(h_i(t'-1))$
such that the second and third terms of $M_i(t)$ are sums of step functions of
correlated variables. These are also correlated through the dynamics with the
normally distributed part of $h_i(t)$. Therefore, the local field can be considered as
a transformation of a set of correlated normally distributed variables $x_s$, which
we choose to normalize. Defining the correlation matrix $W = (\rho(s, s') \equiv E[x_s x_{s'}])$
we arrive at the following expression for $f_{h_i(t)}$ for the fully connected model

$$f_{h_i(t)}(y) = \int \prod_{s=0}^{t-2} dx_s\, dx_t\; \delta\big(y - M_i(t) - \sqrt{V(t)}\, x_t\big)\, \frac{1}{\sqrt{\det(2\pi W)}} \exp\left(-\frac{1}{2}\, \mathbf{x} W^{-1} \mathbf{x}^T\right) \tag{19}$$
with $\mathbf{x} = \{x_s\} = (x_0, \ldots, x_{t-2}, x_t)$. For the symmetric diluted case this expression
simplifies to

$$f_{h_i(t)}(y) = \int \prod_{s=0}^{[t/2]} dx_{t-2s}\; \delta\big[y - \xi_i^1 m^1(t) - \alpha \chi(t-1) \sigma_i(t-1) - \sqrt{\alpha a(t)}\, x_t\big]\, \frac{1}{\sqrt{\det(2\pi W)}} \exp\left(-\frac{1}{2}\, \mathbf{x} W^{-1} \mathbf{x}^T\right) \tag{20}$$

with $\mathbf{x} = \{x_s\} = (x_{t-2[t/2]}, \ldots, x_{t-2}, x_t)$. The brackets $[t/2]$ denote the integer
part of $t/2$.
4 Gap structure

The equilibrium distribution of the local field can be obtained by eliminating
the time dependence in the evolution equations (12)

$$h_i = \xi_i^1 m^1 + \eta\, \mathcal{N}(0, \alpha a) + \alpha \eta \chi \sigma_i \tag{21}$$

with $\eta = 1/(1 - \chi)$ for the fully connected architecture and $\eta = 1$ for the extremely
diluted one. The corresponding updating rule (4)

$$\sigma_i = g_b(\tilde{h}_i + \alpha \eta \chi \sigma_i), \qquad \tilde{h}_i \equiv \xi_i^1 m^1 + \eta\, \mathcal{N}(0, \alpha a) \tag{22}$$

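in general admits more than one solution. A Maxwell construction (see, e.g., refs.
[9], [10], [12]) can be made leading to a unique solution

$$\sigma_i = g_{\tilde{b}}(\tilde{h}_i), \qquad \tilde{b} = b - \frac{\alpha \eta \chi}{2} \tag{23}$$

such that we have

$$\sigma_i = s_k \quad \text{if} \quad \tilde{b}(s_k + s_{k-1}) + \alpha \chi \eta\, s_k < h_i < \tilde{b}(s_k + s_{k+1}) + \alpha \chi \eta\, s_k \tag{24}$$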

for $\tilde{b} > 0$. This unique solution can be used to obtain fixed-point equations for
the main overlap and activity (7). Those equations, which we choose not to write
down explicitly here (see refs. [9], [10]), are equal to the equations derived from
a thermodynamic replica-symmetric mean-field theory approach [13], [14]. We
remark that for analog networks ($Q \to \infty$) such a Maxwell construction is not
necessary because eq. (22) has only one solution.
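The non-uniqueness of eq. (22) and its resolution by eq. (23) can be illustrated directly. The sketch below, with hypothetical function names and with `shift` standing for the product $\alpha\eta\chi$, lists every state $s_k$ that solves the self-consistency condition for a given $\tilde{h}_i$ and compares it with the state picked by the gain function with effective parameter $\tilde{b} = b - \alpha\eta\chi/2$. It assumes $\tilde{b} > 0$ and assigns threshold values to the lower state.

```python
import math

def self_consistent_states(h_tilde, b, shift, states):
    """All s_k with s_k = g_b(h_tilde + shift * s_k), cf. eq. (22)."""
    padded = [-math.inf] + list(states) + [math.inf]
    solutions = []
    for k in range(1, len(states) + 1):
        x = h_tilde + shift * padded[k]
        if b * (padded[k] + padded[k - 1]) < x <= b * (padded[k + 1] + padded[k]):
            solutions.append(padded[k])
    return solutions

def maxwell_state(h_tilde, b, shift, states):
    """Unique state of eq. (23): gain function with b_tilde = b - shift / 2 > 0."""
    b_eff = b - shift / 2.0
    if b_eff <= 0:
        raise ValueError("this sketch only covers b_tilde > 0")
    padded = [-math.inf] + list(states) + [math.inf]
    for k in range(1, len(states) + 1):
        if b_eff * (padded[k] + padded[k - 1]) < h_tilde <= b_eff * (padded[k + 1] + padded[k]):
            return padded[k]
    raise ValueError("h_tilde must not be NaN")
```

For $Q = 3$, $b = 0.5$ and $\alpha\eta\chi = 0.2$, both $\sigma_i = 0$ and $\sigma_i = 1$ solve eq. (22) when $0.3 < \tilde{h}_i \leq 0.5$; the Maxwell construction places the switch at the midpoint $\tilde{b} = 0.4$ of that window.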

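Next, we calculate the probability density of the local field by plugging this
result (22)-(24) into (21) to obtain, forgetting about the site index $i$ and the
pattern index 1,

$$f(h) = \sum_{k=1}^{Q} \frac{1}{\eta \sqrt{2\pi \alpha a}} \exp\left(-\frac{(h - \xi m - \alpha \chi \eta\, s_k)^2}{2 \alpha a \eta^2}\right) \Big(\theta\big[\tilde{b}(s_k + s_{k+1}) + \alpha \chi \eta\, s_k - h\big] - \theta\big[\tilde{b}(s_k + s_{k-1}) + \alpha \chi \eta\, s_k - h\big]\Big) \tag{25}$$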

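meaning that $(Q - 1)$ gaps occur, at $\tilde{b}(s_k + s_{k-1}) + \alpha \chi \eta\, s_{k-1} < h <
\tilde{b}(s_k + s_{k-1}) + \alpha \chi \eta\, s_k$ for $k = 2, \ldots, Q$, each with width $\Delta h = 2 \alpha \chi \eta / (Q - 1)$
since $s_k - s_{k-1} = 2/(Q - 1)$ for equidistant states. For analog networks
no gaps occur. When $\tilde{b} < 0$ the effective gain function (23) becomes two-state
Ising-like as in the Hopfield model, such that in that case only one gap occurs.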
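A direct evaluation of eq. (25) makes the gap structure visible. The following sketch assumes equidistant states and $\tilde{b} > 0$; the function names are hypothetical, `shift` stands for $\alpha\chi\eta$, `alpha_a` for $\alpha a$, and `signal` for $\xi m$. The parameter values one would pass in practice come from the fixed-point equations, which are not reproduced here.

```python
import math

def _padded_states(q):
    """Equidistant states in [-1, 1] with s_0 = -inf and s_{Q+1} = +inf added."""
    return [-math.inf] + [-1.0 + 2.0 * k / (q - 1) for k in range(q)] + [math.inf]

def local_field_density(h, q, b_eff, shift, eta, alpha_a, signal=0.0):
    """Equilibrium density f(h) of eq. (25) for b_eff = b_tilde > 0."""
    s = _padded_states(q)
    norm = 1.0 / (eta * math.sqrt(2.0 * math.pi * alpha_a))
    total = 0.0
    for k in range(1, q + 1):
        lower = b_eff * (s[k] + s[k - 1]) + shift * s[k]
        upper = b_eff * (s[k] + s[k + 1]) + shift * s[k]
        if lower < h < upper:              # difference of the two step functions
            total += norm * math.exp(-(h - signal - shift * s[k]) ** 2
                                     / (2.0 * alpha_a * eta ** 2))
    return total

def gap_intervals(q, b_eff, shift):
    """The Q - 1 intervals of h on which f(h) vanishes identically."""
    s = _padded_states(q)
    return [(b_eff * (s[k] + s[k - 1]) + shift * s[k - 1],
             b_eff * (s[k] + s[k - 1]) + shift * s[k]) for k in range(2, q + 1)]
```

Because each Gaussian piece is merely shifted by $\alpha\chi\eta\, s_k$ and cut at the Maxwell thresholds, the pieces still integrate to one in total even though the density vanishes on the gaps.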
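For $Q = 2$ this expression simplifies to

$$f(h) = \frac{1}{\eta \sqrt{2\pi \alpha a}} \exp\left(-\frac{(h - \xi m - \alpha \chi \eta)^2}{2 \alpha a \eta^2}\right) \theta(h - \alpha \chi \eta) + \frac{1}{\eta \sqrt{2\pi \alpha a}} \exp\left(-\frac{(h - \xi m + \alpha \chi \eta)^2}{2 \alpha a \eta^2}\right) \theta(-h - \alpha \chi \eta) \tag{26}$$

and for $Q = 3$ we have

$$f(h) = \frac{1}{\eta \sqrt{2\pi \alpha a}} \exp\left(-\frac{(h - \xi m - \alpha \chi \eta)^2}{2 \alpha a \eta^2}\right) \theta(h - \tilde{b} - \alpha \chi \eta) + \frac{1}{\eta \sqrt{2\pi \alpha a}} \exp\left(-\frac{(h - \xi m)^2}{2 \alpha a \eta^2}\right) \big[\theta(\tilde{b} - h) - \theta(-\tilde{b} - h)\big] + \frac{1}{\eta \sqrt{2\pi \alpha a}} \exp\left(-\frac{(h - \xi m + \alpha \chi \eta)^2}{2 \alpha a \eta^2}\right) \theta(-\tilde{b} - \alpha \chi \eta - h) . \tag{27}$$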

Similar formulas can be written down for larger values of Q. For $Q = 2$ this result
seems to be consistent with the gap in the internal-field distribution for an
infinite-range spin glass found by a Bethe-Peierls-Weiss approach [15] (see also [16]-[17]).



We have investigated this probability distribution numerically using the
corresponding fixed-point equations mentioned before, for several values of Q, and
compared the results with those obtained from numerical simulations of the dynamics
for networks of $N = 6000$ neurons. Some typical results are shown in figs. 1-6.
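A simulation of this kind can be sketched in a few lines for the fully connected $Q = 2$ case. This is a minimal illustration under stated assumptions, not the code used for the figures: the function name `simulate_retrieval` is hypothetical, binary patterns are used so that $A = 1$, the Hebb sum is evaluated through the pattern overlaps instead of storing $J_{ij}$, and a field exactly equal to zero is mapped to $-1$.

```python
import random

def simulate_retrieval(n, p, m0, steps, seed=0):
    """Zero-temperature parallel dynamics of a fully connected Q = 2 network.

    Returns the overlaps with pattern 1 at t = 0, ..., steps and the local
    fields that produced the final configuration.
    """
    rng = random.Random(seed)
    patterns = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(p)]
    # Initial state with expected overlap m0 with the first pattern.
    state = [x if rng.random() < (1 + m0) / 2 else -x for x in patterns[0]]
    overlaps = [sum(x * s for x, s in zip(patterns[0], state)) / n]
    fields = []
    for _ in range(steps):
        sums = [sum(x * s for x, s in zip(pat, state)) for pat in patterns]
        # h_i = (1/N) sum_mu xi_i^mu sum_{j != i} xi_j^mu sigma_j
        fields = [sum(patterns[mu][i] * (sums[mu] - patterns[mu][i] * state[i])
                      for mu in range(p)) / n for i in range(n)]
        state = [1 if h > 0 else -1 for h in fields]   # all neurons updated at once
        overlaps.append(sum(x * s for x, s in zip(patterns[0], state)) / n)
    return overlaps, fields
```

A histogram of the returned `fields`, taken separately for sites with $\xi_i^1 = +1$ and $\xi_i^1 = -1$, is the quantity to compare with $f(h)$; for small systems the gap edges are smeared by finite-size effects.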

In figs. 1-2 the local field distribution for the fully connected $Q = 2$ network
is shown for a retrieval state ($\alpha = 0.13$, $m_0 = 0.5$) just below the critical capacity
and a non-retrieval spin-glass state ($\alpha = 0.14$, $m_0 = 0.2$) just above it. Both the
first few time steps and the equilibrium result derived above are compared with
numerical simulations. They are in agreement. For the retrieval state there is,
typically, a small gap in the equilibrium distribution around $h = 0$. For small $\alpha$ the
gap is very narrow. Furthermore, in the simulations one sees that this gap shows
up very quickly. For the non-retrieval state the gap is typically much bigger.
Again, in the simulations one quickly sees the gap, but it is extremely difficult
numerically to find points touching the zero axis because of finite-size effects.

Figure 3 shows the gap width at equilibrium, $\Delta h$, for the non-retrieval state as
a function of Q with $b = 0.5$. It scales as $\Delta h \sim 1/(Q - 1)$ and, hence, decreases to
zero for $Q \to \infty$. This constant behaviour of $(Q - 1)\Delta h$ is already attained for values
of $Q > 20$ and is also seen for the retrieval state. These results are insensitive to
the structure of the symmetric architecture.






In figure 4 the gap boundaries in $h$ as a function of $\alpha$ are compared for
retrieval and non-retrieval states in the symmetric diluted $Q = 3$, $b = 0.2$ model.
We remark that in this case the spin-glass states do not exist for $\alpha < 0.04$ [14]
so that there is no gap for these $\alpha$-values. For $\alpha$ large enough ($\alpha > 0.465$ for
retrieval states and $\alpha > 0.252$ for spin-glass states) there exists one gap only since
the effective gain function becomes Ising-like [13]. More gaps with smaller widths
are formed when increasing Q for both the fully connected and diluted models.
For $Q \to \infty$ the gaps disappear.

Figure 5 compares the gaps for the spin-glass states in the fully connected
and symmetric diluted $Q = 3$ models with $b = 0.5$. For $\alpha < 0.25$ there exist no
spin-glass states in the diluted model [14] and for $\alpha < 0.004$ there are none in
the fully connected model [13]. When both do exist the gap widths are almost
equal. So the dilution has some influence on the existence of the gap but, again,
not on its width.

Finally, fig. 6 presents the local field distribution for the symmetric diluted
$Q = 3$, $b = 0.5$ model for a retrieval state ($\alpha = 0.6$, $m_0 = 0.7$) just below the
critical capacity. Only the distribution with pattern values $+1$ is shown. It is
asymmetric and two gaps are found at equilibrium. For pattern values 0 the
distribution is symmetric and the gap locations and widths are the same (see
eq. (25)) but their heights are different.

In conclusion, we have studied the time evolution of the local field in
symmetric Q-Ising neural networks both in the retrieval and spin-glass regime. We
have found a gap structure in the local field distribution depending on the
specific architecture and on the value of Q. The results agree with the numerical
simulations we have performed.



Acknowledgments 

This work has been supported in part by the Fund of Scientific Research,
Flanders-Belgium and the Korea Science and Engineering Foundation through
the SRC program. The authors are indebted to A. Coolen, G. Jongen and
V. Zagrebnov for constructive discussions.



References 



[1] A.E. Patrick and V.A. Zagrebnov, Parallel dynamics for an extremely diluted
neural network, J. Phys. A: Math. Gen. 23: L1323 (1990); J. Phys. A: Math.
Gen. 25: 1009 (1992).

[2] A.E. Patrick and V.A. Zagrebnov, On the parallel dynamics for the Little-Hopfield 
model, J. Stat. Phys. 63: 59 (1991). 

[3] T.L.H. Watkin and D. Sherrington, The parallel dynamics of a dilute symmetric 
neural network, J. Phys. A: Math. Gen. 24: 5427 (1991). 

[4] A.E. Patrick and V.A. Zagrebnov, A probabilistic approach to parallel dynamics 
for the Little-Hopfield model, J. Phys. A: Math. Gen. 24: 3413 (1991). 

[5] D. Bollé, B. Vinck, and V.A. Zagrebnov, On the parallel dynamics of the Q-state
Potts and Q-Ising neural networks, J. Stat. Phys. 70: 1099 (1993).

[6] D. Bollé, G.M. Shim, B. Vinck, and V.A. Zagrebnov, Retrieval and chaos in
extremely diluted Q-Ising neural networks, J. Stat. Phys. 74: 565 (1994).

[7] D. Bollé, G.M. Shim, and B. Vinck, Retrieval and chaos in layered Q-Ising neural
networks, J. Stat. Phys. 74: 583 (1994).

[8] D. Gandolfo, M. Sirugue-Collin and V.A. Zagrebnov, Local instability and
oscillations of trajectories in a diluted symmetric neural network, Network: Computation
in Neural Systems 9: 563 (1998).

[9] D. Bollé, G. Jongen and G.M. Shim, Parallel dynamics of fully connected Q-Ising
neural networks, J. Stat. Phys. 91: 125 (1998).

[10] D. Bollé, G. Jongen and G.M. Shim, Parallel dynamics of extremely diluted
symmetric Q-Ising neural networks, J. Stat. Phys. 96: 861 (1999).

[11] A.N. Shiryayev, Probability (Springer, New York, 1984).

[12] M. Shiino and T. Fukai, Self-consistent signal-to-noise analysis of the statistical
behavior of analog neural networks and enhancement of the storage capacity, Phys.
Rev. E 48: 867 (1993).

[13] D. Bollé, H. Rieger and G.M. Shim, Thermodynamic properties of fully connected
Q-Ising neural networks, J. Phys. A: Math. Gen. 27: 3411 (1994).

[14] D. Bollé, D. Carlucci and G.M. Shim, Thermodynamic properties of extremely
diluted Q-Ising neural networks, J. Phys. A: Math. Gen. 33: 6481 (2000).

[15] L.J. Schowalter and M.W. Klein, Analytic treatment of the hole in the internal
field distribution for an infinite-range spin glass, J. Phys. C: Solid State Physics 12:
L935 (1979).






[16] V.A. Zagrebnov and A.S. Chvyrov, The Little-Hopfield model: recurrence relations
for retrieval-pattern errors, Sov. Phys. JETP 68: 153 (1989).

[17] A.C.C. Coolen and D. Sherrington, Order parameter flow in the fully connected 
Hopfield model near saturation, Phys. Rev. E 49: 1921 (1994). 




Figure 1: A comparison of theoretical results and numerical simulations with $N = 6000$
for the local field distribution $f(h)$ of a retrieval state in the $Q = 2$ system with network
parameters $\alpha = 0.13$, $m_0 = 0.5$. Theoretical (simulation) results for time step $t = 0, 1, 2$
are indicated by a dotted curve (circles), a short-dashed curve (squares) and a
long-dashed curve (diamonds). Simulations for $t = 10, 20$ (stars, triangles) are shown and
the full curve presents the equilibrium distribution.

Figure 2: As in Fig. 1, for a $Q = 2$ non-retrieval spin-glass state with the network
parameters $\alpha = 0.14$, $m_0 = 0.2$. Further simulations for $t = 10$ (stars), $t = 30$ (crosses),
$t = 50$ (filled circles) and $t = 100$ (filled squares) are shown.

Figure 3: The gap width $\Delta h$ for non-retrieval states as a function of Q for the gain
parameter $b = 0.5$ for $\alpha = 1$ (triangles), $\alpha = 0.1$ (squares) and $\alpha = 0.01$ (filled circles).
The inset details the corresponding scaling properties.

Figure 4: The gap boundaries in $h$ as a function of $\alpha$ for retrieval (full curve) and
non-retrieval (dashed curve) states for the $Q = 3$ symmetric diluted systems with gain
parameter $b = 0.2$.

Figure 5: The gap boundaries in $h$ as a function of $\alpha$ for spin-glass states in the
fully connected (short-dashed curve) and symmetric diluted (long-dashed curve) $Q = 3$
system with gain parameter $b = 0.5$.

Figure 6: The local field distribution $f(h)$ of a retrieval state for pattern values $+1$ in
the symmetric diluted $Q = 3$ system with network parameters $\alpha = 0.6$, $b = 0.5$, $m_0 = 0.7$.
Results for $t = 0, 1, 2, \infty$ are indicated by a dotted curve, a short-dashed curve, a
long-dashed curve and a full curve respectively.



